Belief-based nonlinear rescoring in Thai speech understanding
Abstract
This paper proposes an approach to improving speech understanding by rescoring N-best semantic hypotheses. In rescoring, probabilities produced by an understanding component are combined with additional probabilities derived from system beliefs. Whereas the usual rescoring approach multiplies or linearly interpolates with belief probabilities, this paper shows that probabilities from various sources are better combined using a nonlinear estimator. Using the proposed model together with a dialogue-state-dependent semantic model yields a significant improvement when applied to a Thai interactive hotel reservation agent (TIRA), the first spoken dialogue system in the Thai language.
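The three ways of combining scores mentioned in the abstract can be contrasted with a small sketch. This is only an illustration under stated assumptions: the abstract does not specify the form of the nonlinear estimator, so the logistic unit with an interaction term below, its weights, the hypothesis labels, and all probability values are hypothetical and are not taken from the paper.

```python
import math


def rescore_product(p_und, p_belief):
    """Baseline 1: multiply the understanding and belief probabilities."""
    return p_und * p_belief


def rescore_linear(p_und, p_belief, lam=0.5):
    """Baseline 2: linear interpolation with weight lam (assumed value)."""
    return lam * p_und + (1.0 - lam) * p_belief


def rescore_nonlinear(p_und, p_belief, w=(2.0, 2.0, 4.0), b=-4.0):
    """Hypothetical nonlinear estimator: a single logistic unit with an
    interaction term. The weights are made up for illustration; in practice
    they would be trained, and the paper's estimator may differ entirely."""
    z = w[0] * p_und + w[1] * p_belief + w[2] * p_und * p_belief + b
    return 1.0 / (1.0 + math.exp(-z))


def rerank(nbest, combine):
    """Sort N-best semantic hypotheses by combined score, best first."""
    return sorted(
        nbest,
        key=lambda h: combine(h["p_und"], h["p_belief"]),
        reverse=True,
    )


# Hypothetical N-best list for one user turn (all values invented).
nbest = [
    {"sem": "request(room_type)", "p_und": 0.50, "p_belief": 0.20},
    {"sem": "inform(date)", "p_und": 0.40, "p_belief": 0.70},
    {"sem": "confirm(yes)", "p_und": 0.10, "p_belief": 0.10},
]

for name, fn in [
    ("product", rescore_product),
    ("linear", rescore_linear),
    ("nonlinear", rescore_nonlinear),
]:
    best = rerank(nbest, fn)[0]
    print(name, best["sem"])
```

In this toy list the belief probability overturns the understanding component's first choice under all three combiners. The approaches differ in how much flexibility they have to weight each source, which is the aspect the abstract reports on.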
Similar resources
Speech recognition using non-linear trajectories in a formant-based articulatory layer of a multiple-level segmental HMM
This paper describes how non-linear formant trajectories, based on ‘trajectory HMM’ proposed by Tokuda et al., can be exploited under the framework of multiple-level segmental HMMs. In the resultant model, named a non-linear/linear multiple-level segmental HMM, speech dynamics are modeled as non-linear smooth trajectories in the formant-based intermediate layer. These formant trajectories are m...
Fuzzy class rescoring: a part-of-speech language model
Current speech recognition systems usually use word-based trigram language models. More elaborate models are applied to word lattices or N best lists in a rescoring pass following the acoustic decoding process. In this paper we consider techniques for dealing with class-based language models in the lattice rescoring framework of our JANUS large vocabulary speech recognizer. We demonstrate how t...
Towards an improved model of dynamics for speech recognition and synthesis
This thesis describes the research on the use of non-linear formant trajectories to model speech dynamics under the framework of a multiple-level segmental hidden Markov model (MSHMM). The particular type of intermediate-layer model investigated in this study is based on the 12-dimensional parallel formant synthesiser (PFS) control parameters, which can be directly used to synthesise speech wit...
Rescoring under fuzzy measures with a multilayer neural network in a rule-based speech recognition system
In this paper, a speech rescoring system is developed on a set of phonetic hypotheses produced by a bottom-up knowledge-based decoder. An original method to automatically compute a fuzzy membership function from top-down acoustic rules statistics is compared with a possibilistic measure. To aggregate the fuzzy degrees into a phonetic score, a multilayer neural network is trained on the results o...
Task Dependent Loss Functions in Speech Recognition: Search over Recognition Lattices
A recognition strategy that can be matched to specific system performance criteria such as word error rate or F-measure has recently been found to yield improvements over the usual maximum a-posteriori probability strategy [1] [2] [3]. In this matched-to-the-task strategy a hypothesis is chosen to minimize the expected loss or the Bayes Risk under a loss function defined by a performance measur...
Publication date: 2004